ESTIMATION FOR DIFFUSION PROCESSES
UNDER MISSPECIFIED MODELS
ABSTRACT

The asymptotic behavior of the maximum likelihood estimator of a parameter in the drift term of a stationary ergodic diffusion process is studied under conditions in which the true drift function and true noise function do not coincide with those specified by the parametric model.
1. Introduction.

Consider the problem of estimating the drift function $b(x)$ of a stationary diffusion process $(X_t)$ given by

$$dX_t = b(X_t)\,dt + \sigma(X_t)\,dW_t, \qquad t \ge 0,$$

where the process is observed over $[0, T]$. The method of maximum likelihood can be used if $b(x)$ is assumed to have a parametric form $f(x, \theta)$, $\theta \in \Theta$.

Brown and Hewitt (1975), Kutoyants (1977), Lanska (1979) and Prakasa Rao and Rubin (1981) have shown that the maximum likelihood estimator of $\theta$ is consistent and asymptotically normal. Nonparametric methods of estimating $b$ have been developed by Banon (1978) and Geman (1980).

Suppose that a parametric model for the process $(X_t)$ is given by

$$dX_t = f(X_t, \theta)\,dt + \gamma(X_t)\,dW_t, \qquad t \ge 0.$$

This paper studies the asymptotic behavior of the maximum likelihood estimator of $\theta$ under departures of the true drift function $b(x)$ or true noise function $\sigma(x)$ from those specified by the parametric model.

The need for such analysis stems from the desirability of using estimators that are robust under small departures from the underlying model. This kind of analysis is familiar in other settings, for example Huber (1967), White (1981) and Berger and Langberg (1981).

In Section 2 it is shown that the maximum likelihood estimator converges almost surely to a parameter $\theta^*$ such that $f(x, \theta^*)$ minimizes an $L^2$ distance of the parametric family from the true drift function. The $L^2$ distance is defined with respect to the measure $\gamma^{-2}(x)\,d\nu(x)$, where $\nu$ is the stationary distribution of the process. Asymptotic normality is also established. In Section 3 we discuss a way of estimating the difference in goodness of fit of two separate parametric families to the true drift function. Examples are given at the end of each section.
2. Maximum likelihood estimation under misspecified models.

Let $(X_t, t \ge 0)$ be a stationary, ergodic process which is assumed to be the unique solution of the stochastic differential equation

$$dX_t = b(X_t)\,dt + \sigma(X_t)\,dW_t, \qquad t \ge 0, \tag{2.1}$$

where $X_0$ is distributed according to the stationary distribution of the process, $b$ and $\sigma$ are unknown measurable functions and $(W_t, t \ge 0)$ is a standard Wiener process. Assume that $(X_t)$ has inaccessible boundaries on the state space $(-\infty, \infty)$. Using the notation of Mandl (1968), let

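$$B(x) = 2\int_0^x \frac{b(y)}{\sigma^2(y)}\,dy, \qquad p(x) = \int_0^x \exp(-B(y))\,dy,$$

$$m(x) = 2\int_0^x \frac{\exp(B(y))}{\sigma^2(y)}\,dy,$$

where the integrals are assumed to exist. Provided that $m(+\infty) < \infty$ and $m(-\infty) > -\infty$, the stationary distribution, denoted $\nu$, has distribution function $M^{-1}[m(x) - m(-\infty)]$, where $M = m(+\infty) - m(-\infty)$.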
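As a rough numerical illustration of this notation, the sketch below evaluates $B$, the density of $m$, the constant $M$ and the stationary distribution function by the trapezoid rule. The coefficients $b(x) = -x$ and $\sigma(x) = \sqrt{2}$ are an assumed example (the one used in the examples at the end of this section), the truncation of $(-\infty, \infty)$ to $[-10, 10]$ is an assumption, and all function names are hypothetical.

```python
import math

def trapezoid(func, a, b, n=2000):
    """Composite trapezoid rule for func on [a, b] (signed if b < a)."""
    h = (b - a) / n
    total = 0.5 * (func(a) + func(b))
    for k in range(1, n):
        total += func(a + k * h)
    return total * h

# Assumed example coefficients: Ornstein-Uhlenbeck drift, constant noise.
def b(x):
    return -x

def sigma(x):
    return math.sqrt(2.0)

def B(x):
    # B(x) = 2 * int_0^x b(y) / sigma(y)^2 dy
    return 2.0 * trapezoid(lambda y: b(y) / sigma(y) ** 2, 0.0, x, n=200)

def m_density(x):
    # dm/dx = 2 * exp(B(x)) / sigma(x)^2
    return 2.0 * math.exp(B(x)) / sigma(x) ** 2

LOW, HIGH = -10.0, 10.0          # assumed truncation of the state space
M = trapezoid(m_density, LOW, HIGH)

def stationary_cdf(x):
    # M^{-1} [m(x) - m(-infinity)], with -infinity replaced by LOW
    return trapezoid(m_density, LOW, x) / M
```

For these coefficients $B(x) = -x^2/2$, so $dm = e^{-x^2/2}dx$, $M = \sqrt{2\pi}$, and the stationary distribution is $N(0, 1)$.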

Suppose a parametric model is used to estimate the drift function $b(x)$ by the method of maximum likelihood. Let $\Theta$ denote a closed bounded interval. A family of measurable drift functions $\{f(x, \theta),\ \theta \in \Theta\}$ and a measurable noise function $\gamma(x) > 0$ are provided and inference is based on the model

$$dX_t = f(X_t, \theta)\,dt + \gamma(X_t)\,dW_t, \qquad t \ge 0. \tag{2.2}$$

The process $(X_t)$ is observed over $[0, T]$. Let $\mu_\theta^T$ and $\mu^T$ denote the measures induced on $C[0, T]$ by the process satisfying (2.2) and the process

$$dY_t = \gamma(Y_t)\,dW_t, \qquad t \ge 0, \qquad Y_0 = X_0,$$

respectively. Under conditions given in Liptser and Shiryayev (1977, Theorem 7.19) it follows that $\mu_\theta^T \ll \mu^T$ for all $\theta \in \Theta$ and the log likelihood function $\ell_T(\theta) = \log[d\mu_\theta^T/d\mu^T](X)$ is given by

$$\ell_T(\theta) = \int_0^T \frac{f(X_t, \theta)}{\gamma^2(X_t)}\,dX_t - \frac{1}{2}\int_0^T \left[\frac{f(X_t, \theta)}{\gamma(X_t)}\right]^2 dt. \tag{2.3}$$

A maximum likelihood estimator calculated from $\ell_T(\theta)$ is denoted $\hat\theta_T$.
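In practice the path is recorded on a time grid, and (2.3) can be approximated by replacing the stochastic integral with a left-endpoint (Ito) sum. The following is a minimal sketch under that assumption; the function names, the grid search, and the toy path are hypothetical and only illustrate the structure of (2.3).

```python
def log_likelihood(path, dt, f, gamma, theta):
    """Left-endpoint approximation of (2.3) for a path sampled every dt."""
    total = 0.0
    for x_now, x_next in zip(path, path[1:]):
        ratio = f(x_now, theta) / gamma(x_now) ** 2
        total += ratio * (x_next - x_now)            # integral of f / gamma^2 dX
        total -= 0.5 * f(x_now, theta) * ratio * dt  # half the integral of (f / gamma)^2 dt
    return total

def grid_mle(path, dt, f, gamma, grid):
    """Maximize the approximate log likelihood over a finite grid of theta values."""
    return max(grid, key=lambda theta: log_likelihood(path, dt, f, gamma, theta))

# Toy usage with an assumed linear drift family f(x, theta) = theta * x.
toy_path = [0.0, 1.0, 2.0]
linear = lambda x, theta: theta * x
unit_noise = lambda x: 1.0
toy_value = log_likelihood(toy_path, 1.0, linear, unit_noise, 1.0)
toy_argmax = grid_mle(toy_path, 1.0, linear, unit_noise, [0.0, 0.5, 1.0, 1.5, 2.0])
```

A grid search is used only because $\Theta$ is a closed bounded interval; for families that are linear in $\theta$ the maximizer is available in closed form.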


Assume that

$$E\left[\frac{b(X_0) - f(X_0, \theta)}{\gamma(X_0)}\right]^2 < \infty$$

for all $\theta \in \Theta$ and, as a function of $\theta$, has a unique minimum at $\theta^* \in \Theta$. The following results describe the asymptotic behavior of $\hat\theta_T$ when the observed process satisfies (2.1). The conditions are stated later; $g'$, $g''$ denote first and second partial derivatives of a function $g(x, \theta)$ with respect to $\theta$.

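Theorem 2.1. Under conditions (C1)-(C3), $\hat\theta_T \to \theta^*$ a.s. as $T \to \infty$.

Theorem 2.2. Under conditions (C1)-(C7), $T^{1/2}(\hat\theta_T - \theta^*) \xrightarrow{D} N(0, \Sigma)$, where

$$\Sigma = \frac{2M \displaystyle\int_{-\infty}^{\infty} g'(y, \theta^*) \int_y^{\infty} \int_{-\infty}^{s} g'(z, \theta^*)\,dm(z)\,dp(s)\,dm(y)}{\left\{\displaystyle\int_{-\infty}^{\infty} g''(x, \theta^*)\,dm(x)\right\}^2} \tag{2.4}$$

and

$$g(x, \theta) = \frac{f^2(x, \theta)}{\gamma^2(x)} + \sigma^2(x)\,\frac{\partial}{\partial x}\left[f(x, \theta)\,\gamma^{-2}(x)\right]. \tag{2.5}$$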


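Theorem 2.3. In the special case that the drift function has been correctly specified, i.e. $b(x) = f(x, \theta_0)$ for some $\theta_0 \in \Theta$, then (whether or not the noise function has been correctly specified)

(i) $\hat\theta_T \to \theta_0$ a.s. as $T \to \infty$ under conditions (C1)-(C3);

(ii) $T^{1/2}(\hat\theta_T - \theta_0) \xrightarrow{D} N(0, V)$ under conditions (C1)-(C9), where

$$V = \frac{M \displaystyle\int_{-\infty}^{\infty} \left[\frac{\sigma(x) f'(x, \theta_0)}{\gamma^2(x)}\right]^2 dm(x)}{\left\{\displaystyle\int_{-\infty}^{\infty} \left[\frac{f'(x, \theta_0)}{\gamma(x)}\right]^2 dm(x)\right\}^2}. \tag{2.6}$$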


Remarks.

(a) The parameter $\theta^*$ can often be determined from the moments of the stationary distribution. In general it is difficult to evaluate $\Sigma$ unless the drift function has been correctly specified (as in Theorem 2.3). Some examples in which $\theta^*$ and $\Sigma$ are explicitly calculated are given at the end of this section.

(b) When both drift and noise functions have been correctly specified $V$ reduces to the previously known formula

$$V_0 = \left\{E\left[\frac{f'(X_0, \theta_0)}{\sigma(X_0)}\right]^2\right\}^{-1}.$$

The Cauchy-Schwarz inequality yields $V \ge V_0$.

(c) Theorem 2.3 (i) can be interpreted as a robustness result for the maximum likelihood estimator; $\hat\theta_T$ remains consistent under misspecifications of the noise function.

(d) Lanska (1979) introduced a minimum contrast estimator $\tilde\theta_T$ which minimizes

$$\int_0^T h(X_t, \theta)\,dt,$$

where

$$h(x, \theta) = \frac{f^2(x, \theta)}{\gamma^2(x)} + \gamma^2(x)\,\frac{\partial}{\partial x}\left[f(x, \theta)\,\gamma^{-2}(x)\right].$$

If $h$ satisfies the conditions (C1)-(C7) in place of $g$ and the function $E\,h(X_0, \theta)$ has a unique minimum at $\theta_1$, then using similar proofs to Theorems 2.1 and 2.2, $\tilde\theta_T \to \theta_1$ a.s. and $T^{1/2}(\tilde\theta_T - \theta_1) \xrightarrow{D} N(0, \Sigma_1)$, where $\Sigma_1$ is given by (2.4) with $g$ replaced by $h$. Even when the drift function is correctly specified, it is possible for $\theta_1 \ne \theta_0$ when the noise function is misspecified, so that $\tilde\theta_T$ can fail to be a consistent estimator of $\theta_0$ while $\hat\theta_T$ remains consistent.
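The inequality $V \ge V_0$ of Remark (b) can be checked numerically in a simple case. The sketch below assumes that $V$ can be written in expectation form as $E[\sigma^2 f'^2 \gamma^{-4}] / \{E[f'^2 \gamma^{-2}]\}^2$, which is how we read the proof of Theorem 2.3. The true process $dX_t = -X_t\,dt + dW_t$, the family $f(x, \theta) = -\theta x$, and the misspecified noise $\gamma^2(x) = 1 + x^2$ are hypothetical choices made only for illustration.

```python
import math

def normal_expect(func, var, half_width=12.0, n=4000):
    """E[func(X)] for X ~ N(0, var), trapezoid rule on a truncated range."""
    sd = math.sqrt(var)
    a, b = -half_width * sd, half_width * sd
    h = (b - a) / n
    total = 0.0
    for k in range(n + 1):
        x = a + k * h
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * func(x) * math.exp(-x * x / (2.0 * var))
    return total * h / math.sqrt(2.0 * math.pi * var)

# Hypothetical setting: dX = -X dt + dW, whose stationary law is N(0, 1/2).
STAT_VAR = 0.5

def sigma2(x):
    return 1.0            # true squared noise

def f_prime(x):
    return -x             # derivative in theta of f(x, theta) = -theta * x

def gamma2(x):
    return 1.0 + x * x    # hypothetical misspecified squared noise

V = (normal_expect(lambda x: sigma2(x) * f_prime(x) ** 2 / gamma2(x) ** 2, STAT_VAR)
     / normal_expect(lambda x: f_prime(x) ** 2 / gamma2(x), STAT_VAR) ** 2)
V0 = 1.0 / normal_expect(lambda x: f_prime(x) ** 2 / sigma2(x), STAT_VAR)
```

Here $V_0 = 1/E X_0^2 = 2$, and the computed $V$ exceeds it, as the Cauchy-Schwarz argument predicts.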

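Conditions.

(C1) $|f(x, \theta_1) - f(x, \theta_2)| \le J(x)\,\psi(\theta_1 - \theta_2)$, $x \in R$, $\theta_1, \theta_2 \in \Theta$, where $E[\,\cdots\,] < \infty$ and $\lim_{\alpha \to 0} \psi(\alpha) = 0$.

(C2) $E[\,\cdots\,] < \infty$ and $|f(x, \theta)| \le K(x)$, $x \in R$, $\theta \in \Theta$, where [...]

(C3) $f(x, \theta)$ is continuous in $(x, \theta)$ and differentiable with respect to $\theta$. There exists $\alpha > 0$ such that $|f'(x, \theta_1) - f'(x, \theta_2)| \le c(x)\,|\theta_1 - \theta_2|^{\alpha}$, $x \in R$, $\theta_1, \theta_2 \in \Theta$, where

$$E\left[\frac{\sigma(X_0)\,c(X_0)}{\gamma^2(X_0)}\right]^2 < \infty.$$

(C4) $f(x, \theta)\,\gamma^{-2}(x)$ has a continuous first partial derivative with respect to $x$ for each $\theta \in \Theta$.

(C5) $\lim_{x \to \pm\infty} f(x, \theta)\,\gamma^{-2}(x) \exp B(x) = 0$, $\forall\, \theta \in \Theta$.

(C6) The partial derivatives $g'$, $g''$, $G'$, $G''$ exist and are continuous in $(x, \theta)$, where $G(x, \theta)$ is defined by

$$G(x, \theta) = \int_0^x \frac{f(y, \theta)}{\gamma^2(y)}\,dy. \tag{2.7}$$

(C7) $|g''(x, \theta_1) - g''(x, \theta_2)| \le U(x)\,\psi(\theta_1 - \theta_2)$, $x \in R$, $\theta_1, \theta_2 \in \Theta$, where $E\,U(X_0) < \infty$ and $\lim_{\alpha \to 0} \psi(\alpha) = 0$.

(C8) $\lim_{x \to \pm\infty} f'(x, \theta_0)\,\gamma^{-2}(x)\,[\exp B(x)] \int_0^x f'(s, \theta_0)\,\gamma^{-2}(s)\,ds = 0$.

(C9) $\lim_{x \to \pm\infty} f''(x, \theta_0)\,\gamma^{-2}(x) \exp B(x) = 0$.

(C10) Condition (C7) with $g'$ in place of $g''$.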


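Proof of Theorem 2.1. The proof of strong consistency of $\hat\theta_T$ in the correctly specified case given by Prakasa Rao and Rubin (1981, Theorem 4.1) needs only minor modifications to show that $\hat\theta_T \to \theta^*$ a.s. as $T \to \infty$ in the misspecified case. The details are given here for completeness. From (2.1) and (2.3), $\ell_T(\theta)$ can be written

$$\ell_T(\theta) = -\frac{1}{2}\int_0^T \left[\frac{f(X_t, \theta) - b(X_t)}{\gamma(X_t)}\right]^2 dt + \frac{1}{2}\int_0^T \left[\frac{b(X_t)}{\gamma(X_t)}\right]^2 dt + \int_0^T \frac{f(X_t, \theta)\,\sigma(X_t)}{\gamma^2(X_t)}\,dW_t. \tag{2.6}$$

Denote

$$I_T(\theta) = \int_0^T \left[\frac{f(X_t, \theta) - b(X_t)}{\gamma(X_t)}\right]^2 dt.$$

For $\theta_1, \theta_2 \in \Theta$,

$$|I_T(\theta_1) - I_T(\theta_2)| \le \int_0^T \frac{|f(X_t, \theta_1) - f(X_t, \theta_2)|\,(|f(X_t, \theta_1)| + |f(X_t, \theta_2)|)}{\gamma^2(X_t)}\,dt + 2\int_0^T \frac{|b(X_t)|\,|f(X_t, \theta_1) - f(X_t, \theta_2)|}{\gamma^2(X_t)}\,dt$$

$$\le 2\,\psi(\theta_1 - \theta_2) \int_0^T \frac{J(X_t)\,(K(X_t) + |b(X_t)|)}{\gamma^2(X_t)}\,dt,$$

using conditions (C1) and (C2). By the ergodic theorem

$$\frac{1}{T}\int_0^T \frac{J(X_t)\,(K(X_t) + |b(X_t)|)}{\gamma^2(X_t)}\,dt \to E\left[\frac{J(X_0)\,(K(X_0) + |b(X_0)|)}{\gamma^2(X_0)}\right] \quad \text{a.s.}$$

as $T \to \infty$, using conditions (C1) and (C2) again. Thus, there exists a r.v. $C^*$, which is finite a.s. and does not depend on $\theta$, such that

$$|I_T(\theta_1) - I_T(\theta_2)| \le C^*\,T\,\psi(\theta_1 - \theta_2), \quad \text{for all } \theta_1, \theta_2 \in \Theta,\ T \ge 0.$$

Similarly, using condition (C2) again, it follows that there exists a r.v. $D^*$, which is finite a.s. and does not depend on $\theta$, such that $I_T(\theta) \le D^* T$, for all $\theta \in \Theta$, $T \ge 0$. Thus, $\{T^{-1} I_T,\ T \ge 0\}$ is equicontinuous and uniformly bounded a.s. as a family of functions of $\theta$. By the Arzela-Ascoli theorem this family is relatively compact (a.s.) in the space of continuous functions on $\Theta$ provided with the supremum norm. Therefore, by the ergodic theorem

$$\frac{1}{T}\,I_T(\theta) \to E\left[\frac{f(X_0, \theta) - b(X_0)}{\gamma(X_0)}\right]^2 \quad \text{a.s.}$$

uniformly in $\theta \in \Theta$ as $T \to \infty$.

Now consider the second term in (2.6). By condition (C2) and the ergodic theorem

$$\frac{1}{T}\int_0^T \left[\frac{b(X_t)}{\gamma(X_t)}\right]^2 dt \to E\left[\frac{b(X_0)}{\gamma(X_0)}\right]^2 \quad \text{a.s.}$$

Next, using Lemma 4.3 of Prakasa Rao and Rubin (1981) it follows that, under conditions (C1)-(C3),

$$\frac{1}{T}\int_0^T \frac{f(X_t, \theta)\,\sigma(X_t)}{\gamma^2(X_t)}\,dW_t \to 0 \quad \text{a.s., uniformly in } \theta \in \Theta \text{ as } T \to \infty.$$

Hence

$$\frac{1}{T}\,\ell_T(\theta) \to -\frac{1}{2}\,E\left[\frac{f(X_0, \theta) - b(X_0)}{\gamma(X_0)}\right]^2 + \frac{1}{2}\,E\left[\frac{b(X_0)}{\gamma(X_0)}\right]^2 \quad \text{a.s.} \tag{2.7}$$

uniformly in $\theta \in \Theta$ as $T \to \infty$. Since the r.h.s. of (2.7) has a unique maximum at $\theta^* \in \Theta$ and $\hat\theta_T$ maximizes $T^{-1}\ell_T(\theta)$, it is easily proved that $\hat\theta_T \to \theta^*$ a.s. as $T \to \infty$. □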
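Theorem 2.1 can be illustrated by simulation. The sketch below uses the setting of the first example at the end of this section: the observed process is $dX_t = -X_t\,dt + \sqrt{2}\,dW_t$ and the fitted model has drift $-\theta x^3$, for which $\theta^* = 1/5$. The Euler discretization, the step size, the horizon, and the function names are assumptions of this sketch, and the discretized estimator only approximates $\hat\theta_T$.

```python
import math
import random

def simulate_ou(n_steps, dt, seed=0):
    """Euler scheme for dX = -X dt + sqrt(2) dW, started from N(0, 1)."""
    rng = random.Random(seed)
    x = rng.gauss(0.0, 1.0)
    path = [x]
    scale = math.sqrt(2.0 * dt)
    for _ in range(n_steps):
        x += -x * dt + scale * rng.gauss(0.0, 1.0)
        path.append(x)
    return path

def cubic_drift_mle(path, dt):
    """Maximizer of the discretized (2.3) for f(x, theta) = -theta * x**3.

    The noise function is constant here, so it cancels from the estimator.
    """
    num = 0.0
    den = 0.0
    for x_now, x_next in zip(path, path[1:]):
        num += x_now ** 3 * (x_next - x_now)
        den += x_now ** 6 * dt
    return -num / den

DT = 0.01
ou_path = simulate_ou(400_000, DT)       # horizon T = 4000
theta_hat = cubic_drift_mle(ou_path, DT)
```

With a long horizon the estimate should settle near $1/5$ rather than near any "true" cubic coefficient, since no such coefficient exists for the linear drift.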

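Proof of Theorem 2.2. The approach used by Prakasa Rao and Rubin (1981) to find the asymptotic distribution of $\hat\theta_T$ in the correctly specified case does not extend to the misspecified case. Rather, the proof of this theorem uses the technique, introduced by Lanska (1979), of expressing $\ell_T(\theta)$ in terms of Lebesgue integrals.

The function $G(x, \theta)$ defined in (2.7) has a continuous second partial derivative with respect to $x$ for each $\theta \in \Theta$ by condition (C4). Applying Ito's formula, it follows that

$$G(X_T, \theta) = G(X_0, \theta) + \int_0^T \left[\frac{b(X_t)\,f(X_t, \theta)}{\gamma^2(X_t)} + \frac{1}{2}\,\sigma^2(X_t)\,\frac{\partial^2 G}{\partial x^2}(X_t, \theta)\right] dt + \int_0^T \frac{\sigma(X_t)\,f(X_t, \theta)}{\gamma^2(X_t)}\,dW_t.$$

Then, using (2.1) and (2.3),

$$\ell_T(\theta) = \int_0^T \frac{f(X_t, \theta)\,b(X_t)}{\gamma^2(X_t)}\,dt + \int_0^T \frac{f(X_t, \theta)\,\sigma(X_t)}{\gamma^2(X_t)}\,dW_t - \frac{1}{2}\int_0^T \left[\frac{f(X_t, \theta)}{\gamma(X_t)}\right]^2 dt$$

$$= G(X_T, \theta) - G(X_0, \theta) - \frac{1}{2}\int_0^T g(X_t, \theta)\,dt, \tag{2.8}$$

where $g$ is defined in (2.5). Expand $\ell_T'(\theta)$ about $\hat\theta_T$,

$$\ell_T'(\theta^*) = \ell_T'(\hat\theta_T) + (\theta^* - \hat\theta_T)\,\ell_T''(\tilde\theta_T), \quad \text{where } |\tilde\theta_T - \theta^*| \le |\hat\theta_T - \theta^*|.$$

Consider

$$T^{-1/2}\,\ell_T'(\theta^*) = T^{-1/2}\left(G'(X_T, \theta^*) - G'(X_0, \theta^*)\right) - \frac{1}{2}\,T^{-1/2}\int_0^T g'(X_t, \theta^*)\,dt.$$

Using the stationarity of $(X_t)$,

$$T^{-1/2}\left(G'(X_T, \theta^*) - G'(X_0, \theta^*)\right) \xrightarrow{P} 0 \quad \text{as } T \to \infty.$$

Using integration by parts and condition (C5) it can be shown that

$$E\,g(X_0, \theta) = E\left[\frac{f(X_0, \theta) - b(X_0)}{\gamma(X_0)}\right]^2 - E\left[\frac{b(X_0)}{\gamma(X_0)}\right]^2, \tag{2.9}$$

and since the right hand side of this expression is minimized at $\theta^* \in \Theta$, it follows that $E\,g(X_0, \theta)$ is minimized at $\theta^* \in \Theta$ and $E\,g'(X_0, \theta^*) = 0$. Then, by Mandl (1968, p. 94)

$$T^{-1/2}\int_0^T g'(X_t, \theta^*)\,dt \xrightarrow{D} N(0, \Lambda),$$

where

$$\Lambda = \frac{2}{M}\int_{-\infty}^{\infty} g'(y, \theta^*) \int_y^{\infty} \int_{-\infty}^{s} g'(z, \theta^*)\,dm(z)\,dp(s)\,dm(y).$$

Thus $T^{-1/2}\,\ell_T'(\theta^*) \xrightarrow{D} N(0, \Lambda/4)$. By the ergodic theorem and condition (C6)

$$\frac{1}{T}\,\ell_T''(\theta^*) \xrightarrow{P} -\frac{1}{2}\,E\,g''(X_0, \theta^*).$$

Next, using conditions (C6), (C7) and the fact that $\tilde\theta_T \to \theta^*$ a.s.,

$$\frac{1}{T}\left(\ell_T''(\tilde\theta_T) - \ell_T''(\theta^*)\right) \xrightarrow{P} 0, \quad \text{as } T \to \infty.$$

Thus

$$\frac{1}{T}\,\ell_T''(\tilde\theta_T) \xrightarrow{P} -\frac{1}{2}\,E\,g''(X_0, \theta^*) = -\frac{1}{2M}\int_{-\infty}^{\infty} g''(x, \theta^*)\,dm(x),$$

as $T \to \infty$. Since $\ell_T'(\hat\theta_T) = 0$, the expansion gives $T^{1/2}(\hat\theta_T - \theta^*) = -T^{-1/2}\,\ell_T'(\theta^*) \big/ T^{-1}\ell_T''(\tilde\theta_T)$, and we conclude that

$$T^{1/2}(\hat\theta_T - \theta^*) \xrightarrow{D} N(0, \Sigma),$$

where $\Sigma = \Lambda / \{E\,g''(X_0, \theta^*)\}^2$ is given in (2.4). □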

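Proof of Theorem 2.3. Suppose that $b(x) = f(x, \theta_0)$, where $\theta_0 \in \Theta$. Then $\theta^* = \theta_0$ and (i) follows directly from Theorem 2.1. The proof of (ii) consists in showing that $\Sigma$ in (2.4) reduces to $V$ given in (2.6). Note that

$$g'(x, \theta_0) = 2\,b(x)\,f'(x, \theta_0)\,\gamma^{-2}(x) + \sigma^2(x)\,\frac{\partial}{\partial x}\left[f'(x, \theta_0)\,\gamma^{-2}(x)\right].$$

Using integration by parts,

$$\int_{-\infty}^{s} \sigma^2(z)\,\frac{\partial}{\partial z}\left[f'(z, \theta_0)\,\gamma^{-2}(z)\right] dm(z) = 2 f'(s, \theta_0)\,\gamma^{-2}(s) \exp B(s) - 2\int_{-\infty}^{s} b(z)\,f'(z, \theta_0)\,\gamma^{-2}(z)\,dm(z),$$

so that

$$\int_{-\infty}^{s} g'(z, \theta_0)\,dm(z) = 2 f'(s, \theta_0)\,\gamma^{-2}(s) \exp B(s),$$

and, taking the upper limit of the $dp$ integral to be 0 (which changes it only by an additive constant and so does not affect (2.4), because $\int g'(y, \theta_0)\,dm(y) = 0$),

$$\int_y^{0} \int_{-\infty}^{s} g'(z, \theta_0)\,dm(z)\,dp(s) = 2\int_y^{0} f'(s, \theta_0)\,\gamma^{-2}(s)\,ds.$$

Using integration by parts with condition (C8),

$$\int_{-\infty}^{\infty} \sigma^2(y)\,\frac{\partial}{\partial y}\left[f'(y, \theta_0)\,\gamma^{-2}(y)\right] \int_y^{0} f'(s, \theta_0)\,\gamma^{-2}(s)\,ds\;dm(y)$$

$$= -2\int_{-\infty}^{\infty} f'(y, \theta_0)\,\gamma^{-2}(y)\,\frac{\partial}{\partial y}\left[\int_y^{0} f'(s, \theta_0)\,\gamma^{-2}(s)\,ds\,\exp B(y)\right] dy$$

$$= \int_{-\infty}^{\infty} \left[\frac{\sigma(y)\,f'(y, \theta_0)}{\gamma^2(y)}\right]^2 dm(y) - 2\int_{-\infty}^{\infty} b(y)\,f'(y, \theta_0)\,\gamma^{-2}(y) \int_y^{0} f'(s, \theta_0)\,\gamma^{-2}(s)\,ds\;dm(y).$$

It follows that the numerator of $\Sigma$ reduces to

$$4M\int_{-\infty}^{\infty} \left[\frac{\sigma(y)\,f'(y, \theta_0)}{\gamma^2(y)}\right]^2 dm(y).$$

Now consider the denominator of $\Sigma$.

$$g''(x, \theta_0) = 2\left[\frac{f'(x, \theta_0)}{\gamma(x)}\right]^2 + 2 f(x, \theta_0)\,f''(x, \theta_0)\,\gamma^{-2}(x) + \sigma^2(x)\,\frac{\partial}{\partial x}\left[f''(x, \theta_0)\,\gamma^{-2}(x)\right].$$

Using integration by parts and condition (C9)

$$\int_{-\infty}^{\infty} \sigma^2(x)\,\frac{\partial}{\partial x}\left[f''(x, \theta_0)\,\gamma^{-2}(x)\right] dm(x) = -2\int_{-\infty}^{\infty} f(x, \theta_0)\,f''(x, \theta_0)\,\gamma^{-2}(x)\,dm(x)$$

and it follows that

$$\int_{-\infty}^{\infty} g''(x, \theta_0)\,dm(x) = 2\int_{-\infty}^{\infty} \left[\frac{f'(x, \theta_0)}{\gamma(x)}\right]^2 dm(x).$$

This completes the proof of the theorem. □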
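Examples.

1. This example presents a misspecified drift function but correctly specified noise function. Suppose that the observed process satisfies

$$dX_t = -X_t\,dt + \sqrt{2}\,dW_t,$$

where $X_0$ has a $N(0, 1)$ distribution, the stationary distribution of $(X_t)$. Estimates are calculated from the parametric model

$$dX_t = -\theta X_t^3\,dt + \sqrt{2}\,dW_t.$$

The parameter $\theta^*$ which minimizes $E(X_0 - \theta X_0^3)^2$ is given by

$$\theta^* = \frac{E X_0^4}{E X_0^6} = \frac{1}{5}.$$

By Theorem 2.1, $\hat\theta_T \to \frac{1}{5}$ a.s. as $T \to \infty$. We also have $g(x, \theta) = \frac{1}{2}\theta^2 x^6 - 3\theta x^2$, so that $g'(x, \theta^*) = \frac{1}{5}x^6 - 3x^2$. Using repeated integration by parts

$$\int_{-\infty}^{s} g'(z, \theta^*)\,dm(z) = \int_{-\infty}^{s} \left(\tfrac{1}{5}z^6 - 3z^2\right) e^{-z^2/2}\,dz = -\left(\tfrac{1}{5}s^5 + s^3\right) e^{-s^2/2},$$

so that, taking the upper limit of the $dp$ integral to be 0 (which changes it only by an additive constant that integrates to zero against $g'\,dm$),

$$\int_y^{0} \int_{-\infty}^{s} g'(z, \theta^*)\,dm(z)\,dp(s) = \tfrac{1}{30}y^6 + \tfrac{1}{4}y^4,$$

$$\int_{-\infty}^{\infty} g'(y, \theta^*) \int_y^{0} \int_{-\infty}^{s} g'(z, \theta^*)\,dm(z)\,dp(s)\,d\nu(y) = \int_{-\infty}^{\infty} \left(\tfrac{1}{5}y^6 - 3y^2\right)\left(\tfrac{1}{30}y^6 + \tfrac{1}{4}y^4\right) d\nu(y)$$

$$= \tfrac{1}{150}\,E X_0^{12} + \tfrac{1}{20}\,E X_0^{10} - \tfrac{1}{10}\,E X_0^{8} - \tfrac{3}{4}\,E X_0^{6} = 94.8,$$

$$\int_{-\infty}^{\infty} g''(x, \theta^*)\,d\nu(x) = E X_0^6 = 15.$$

Thus $\Sigma = 2(94.8)/(15)^2 = .84$, and by Theorem 2.2 we have $T^{1/2}(\hat\theta_T - \frac{1}{5}) \xrightarrow{D} N(0, .84)$. The asymptotic variance of $\hat\theta_T$ is less than in the correctly specified case for which $T^{1/2}(\hat\theta_T - 1) \xrightarrow{D} N(0, 1)$.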
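The arithmetic behind $\theta^* = 1/5$ and $\Sigma = .84$ in this example can be reproduced exactly from the moments of the standard normal distribution. The sketch below assumes the upper limit 0 in the $dp$ integral, so that the inner double integral is the polynomial $\frac{1}{30}y^6 + \frac{1}{4}y^4$; the helper names and the dictionary representation of polynomials are our own.

```python
from fractions import Fraction

def normal_moment(k):
    """E[Z^k] for Z ~ N(0, 1): zero for odd k, (k - 1)!! for even k."""
    if k % 2:
        return Fraction(0)
    out = Fraction(1)
    for j in range(1, k, 2):
        out *= j
    return out

def expect(poly):
    """Expectation of a polynomial {power: coefficient} under N(0, 1)."""
    return sum(coef * normal_moment(power) for power, coef in poly.items())

def multiply(p, q):
    """Product of two polynomials in {power: coefficient} form."""
    out = {}
    for i, a in p.items():
        for j, c in q.items():
            out[i + j] = out.get(i + j, Fraction(0)) + a * c
    return out

theta_star = normal_moment(4) / normal_moment(6)      # E X^4 / E X^6
g_prime = {6: theta_star, 2: Fraction(-3)}            # g'(x, theta*)
inner = {6: Fraction(1, 30), 4: Fraction(1, 4)}       # inner double integral
g_second = {6: Fraction(1)}                           # g''(x, theta*) = x^6

numerator = expect(multiply(g_prime, inner))          # triple integral w.r.t. nu
sigma_asym = 2 * numerator / expect(g_second) ** 2    # (2.4) rewritten with nu
```

Because $dm = M\,d\nu$, the factors of $M$ in (2.4) cancel, which is why only moments under $\nu$ are needed.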


2. Our second example has a correctly specified drift function and a misspecified noise function. The observed process is the same as in the first example but the parametric model is given by

$$dX_t = -\theta X_t\,dt + \ldots$$

Theorem 2.3 yields $T^{1/2}(\hat\theta_T - 1) \xrightarrow{D} N(0, 2.75)$. The asymptotic variance of $\hat\theta_T$ has almost tripled due to the misspecified noise.


3.  Discriminating  between  separate  families  of  drift  functions. 

Let $(X_t)$ satisfy (2.1) and assume throughout this section $\sigma(x) \equiv 1$. Suppose that two parametric models for this process have been suggested. It is required to decide in favor of the model which best fits the observed trajectory $(X_t,\ 0 \le t \le T)$.

Let $\{f_1(x, \theta): \theta \in \Theta\}$, $\{f_2(x, \phi): \phi \in \Phi\}$ be distinct families of drift functions, where $\Theta$, $\Phi$ are closed bounded intervals. A reasonable way to compare the goodness of fit of these families to the true drift function $b(x)$ is to estimate the parameter

$$\Delta = E[f_2(X_0, \phi^*) - b(X_0)]^2 - E[f_1(X_0, \theta^*) - b(X_0)]^2.$$

In this section we introduce an estimator for $\Delta$. The noise function is assumed to be correctly specified; that is $\gamma(x) \equiv 1$.

Let $\ell_T^{(1)}(\theta)$, $\ell_T^{(2)}(\phi)$ denote the log likelihoods for the two models. Define

$$\hat\Delta_T = \frac{2}{T}\left[\ell_T^{(1)}(\hat\theta_T) - \ell_T^{(2)}(\hat\phi_T)\right].$$

Given that $f_1$, $f_2$ satisfy conditions (C1)-(C3), it follows from (2.7) in the proof of Theorem 2.1 that $\hat\Delta_T \to \Delta$ a.s. as $T \to \infty$. The following result shows that $\hat\Delta_T$ is asymptotically normal. Conditions from Section 2 are used interchangeably between the two families of drift functions indexed by $\theta$ and $\phi$.
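A minimal sketch of how such an estimator might be computed from a discretely sampled path is given below, assuming $\hat\Delta_T$ is $2/T$ times the difference of the two maximized log likelihoods. It uses the linear family $f_1(x, \theta) = -\theta x$ and the cubic family $f_2(x, \phi) = -\phi x^3$ with unit noise, for which each discretized log likelihood is quadratic in its parameter and its maximum has a closed form. The Euler simulation, step size, horizon, and function names are assumptions of the sketch.

```python
import math
import random

def simulate_linear(theta0, n_steps, dt, seed=1):
    """Euler scheme for dX = -theta0 * X dt + dW, started from N(0, 1/(2*theta0))."""
    rng = random.Random(seed)
    x = rng.gauss(0.0, math.sqrt(1.0 / (2.0 * theta0)))
    path = [x]
    scale = math.sqrt(dt)
    for _ in range(n_steps):
        x += -theta0 * x * dt + scale * rng.gauss(0.0, 1.0)
        path.append(x)
    return path

def delta_hat(path, dt):
    """(2/T) * [max log lik of linear model - max log lik of cubic model]."""
    s_xdx = s_x2 = s_x3dx = s_x6 = 0.0
    for x_now, x_next in zip(path, path[1:]):
        dx = x_next - x_now
        s_xdx += x_now * dx
        s_x2 += x_now ** 2 * dt
        s_x3dx += x_now ** 3 * dx
        s_x6 += x_now ** 6 * dt
    horizon = dt * (len(path) - 1)
    loglik_linear = 0.5 * s_xdx ** 2 / s_x2    # maximum of -theta*S1 - theta^2*S2/2
    loglik_cubic = 0.5 * s_x3dx ** 2 / s_x6    # same form for the cubic family
    return 2.0 * (loglik_linear - loglik_cubic) / horizon

DT = 0.01
linear_path = simulate_linear(1.0, 400_000, DT)   # horizon T = 4000
delta_estimate = delta_hat(linear_path, DT)
```

When the data come from the linear model with $\theta_0 = 1$, the estimate should be positive and close to $.2$, the value of $\Delta$ reported for this pair of models in the example at the end of the section.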


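Theorem 3.1. Suppose that $f_1$, $f_2$ satisfy conditions (C1)-(C6) and (C10), where $g_1$, $g_2$ are given by (2.5) with $f = f_1, f_2$ respectively, $\gamma(x) \equiv 1$. Then $T^{1/2}(\hat\Delta_T - \Delta) \xrightarrow{D} N(0, \Sigma_2)$, where

$$\Sigma_2 = \frac{2}{M}\int_{-\infty}^{\infty} \left[g_2(y, \phi^*) - g_1(y, \theta^*) - \Delta\right] \int_y^{\infty} \int_{-\infty}^{s} \left[g_2(z, \phi^*) - g_1(z, \theta^*) - \Delta\right] dm(z)\,dp(s)\,dm(y).$$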


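Proof. From (2.8)

$$T^{1/2}(\hat\Delta_T - \Delta) = T^{-1/2}\int_0^T \left[g_2(X_t, \hat\phi_T) - g_1(X_t, \hat\theta_T) - \Delta\right] dt + 2\,T^{-1/2}\left[G_1(X_T, \hat\theta_T) - G_1(X_0, \hat\theta_T) - G_2(X_T, \hat\phi_T) + G_2(X_0, \hat\phi_T)\right]. \tag{3.1}$$

The second term on the right hand side of (3.1) converges to 0 in probability. The first term is written

$$T^{-1/2}\int_0^T \left[g_2(X_t, \phi^*) - g_1(X_t, \theta^*) - \Delta\right] dt + T^{-1/2}\int_0^T \left[g_2(X_t, \hat\phi_T) - g_2(X_t, \phi^*)\right] dt + T^{-1/2}\int_0^T \left[g_1(X_t, \theta^*) - g_1(X_t, \hat\theta_T)\right] dt$$

$$= A_T + B_T + C_T.$$

From (2.9),

$$E\left[g_2(X_0, \phi^*) - g_1(X_0, \theta^*) - \Delta\right] = 0$$

so, by Mandl (1968, p. 94), $A_T \xrightarrow{D} N(0, \Sigma_2)$. Next, consider $C_T$. Expanding $g_1$ in a neighborhood around $\theta^*$,

$$g_1(x, \theta) = g_1(x, \theta^*) + (\theta - \theta^*)\,g_1'(x, \tilde\theta),$$

where $|\tilde\theta - \theta^*| \le |\theta - \theta^*|$. This gives

$$C_T = -T^{1/2}(\hat\theta_T - \theta^*) \cdot \frac{1}{T}\int_0^T g_1'(X_t, \tilde\theta_T)\,dt.$$

But,

$$\frac{1}{T}\int_0^T g_1'(X_t, \tilde\theta_T)\,dt = \frac{1}{T}\int_0^T g_1'(X_t, \theta^*)\,dt + \frac{1}{T}\int_0^T \left[g_1'(X_t, \tilde\theta_T) - g_1'(X_t, \theta^*)\right] dt. \tag{3.2}$$

By the proof of Theorem 2.2, $E\,g_1'(X_0, \theta^*) = 0$, so by the ergodic theorem the first term in (3.2) converges to 0 a.s. as $T \to \infty$. From condition (C10), the second term in (3.2) is bounded above by

$$\psi(\tilde\theta_T - \theta^*)\,\frac{1}{T}\int_0^T U(X_t)\,dt,$$

which converges to 0 a.s. since $|\tilde\theta_T - \theta^*| \le |\hat\theta_T - \theta^*|$, $\hat\theta_T \to \theta^*$ a.s. and $\frac{1}{T}\int_0^T U(X_t)\,dt \to E\,U(X_0)$ a.s. as $T \to \infty$. Since $T^{1/2}(\hat\theta_T - \theta^*)$ is bounded in probability, $C_T \to 0$ in probability, and similarly $B_T \to 0$ in probability. This completes the proof of the theorem. □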


Example.

Consider the two models

$$dX_t = -\theta X_t\,dt + dW_t, \tag{3.3}$$

$$dX_t = -\phi X_t^3\,dt + dW_t, \tag{3.4}$$

and suppose that the observed process satisfies (3.3) with $\theta = \theta_0 > 0$. Some involved but routine calculations give that $\Delta = .2\,\theta_0$ and $\Sigma_2 = .64\,\theta_0 + 9.36\,\theta_0^{-1}$. Note that $\Sigma_2 \to \infty$ as $\theta_0 \to 0$. The poor performance of $\hat\Delta_T$ for small $\theta_0$ is to be expected since, as $\theta_0 \to 0$, the drift function has less effect on the dynamics of the process so it is harder to discriminate between the two models.
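The value $\Delta = .2\,\theta_0$ can be checked directly from Gaussian moments. Under (3.3) with unit noise the stationary distribution is $N(0, 1/(2\theta_0))$, the linear family fits exactly, and $\phi^*$ minimizes $E[\phi X_0^3 - \theta_0 X_0]^2$. The sketch below carries out this computation in exact arithmetic; the function names are hypothetical.

```python
from fractions import Fraction

def centered_normal_moment(k, var):
    """E[X^k] for X ~ N(0, var): zero for odd k, (k - 1)!! * var^(k/2) for even k."""
    if k % 2:
        return Fraction(0)
    out = Fraction(1)
    for j in range(1, k, 2):
        out *= j
    return out * Fraction(var) ** (k // 2)

def delta_linear_vs_cubic(theta0):
    """Return (phi_star, Delta) when the truth is dX = -theta0 * X dt + dW."""
    theta0 = Fraction(theta0)
    var = 1 / (2 * theta0)                      # stationary variance
    m2 = centered_normal_moment(2, var)
    m4 = centered_normal_moment(4, var)
    m6 = centered_normal_moment(6, var)
    phi_star = theta0 * m4 / m6                 # least-squares cubic coefficient
    # E[phi* X^3 - theta0 X]^2; the linear family contributes zero error.
    delta = phi_star ** 2 * m6 - 2 * theta0 * phi_star * m4 + theta0 ** 2 * m2
    return phi_star, delta

phi_star_1, delta_1 = delta_linear_vs_cubic(1)
phi_star_3, delta_3 = delta_linear_vs_cubic(3)
```

For any $\theta_0 > 0$ this gives $\phi^* = 2\theta_0^2/5$ and $\Delta = \theta_0/5$.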